A robust front-end algorithm for distributed speech recognition
Authors
Abstract
This paper presents the robust front-end algorithm that Motorola submitted to the ETSI STQ-Aurora DSR working group in January 2001 as a proposal for the Advanced DSR front-end. The algorithm includes a two-stage mel-warped Wiener filter, a waveform processor, a channel-normalized mel-frequency cepstral calculation, and a post-cepstral processing subsystem that operates according to the reliability of the mel-spectral components. The output of the algorithm, a set of Mel-Frequency Cepstral Coefficients (MFCC), is compressed and encoded according to the ETSI ES 201 108 standard and then transmitted at 4800 bps. Compared with the ETSI standard MFCC front-end, the proposed algorithm delivers a 42.64% performance improvement on the Aurora 2 database, the evaluation required by this Eurospeech Special Event. With a very simple frame deletion algorithm based on Voice-Onset Detection (VOD), the improvement rises to 47.58% on the same database. We also give further insight into the proposal by reporting performance and analysis on the Aurora SpeechDat-Car databases.
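The 4800 bps figure can be checked with a little arithmetic. The sketch below is only an illustration: the framing constants (44 bits per frame, a 4-bit CRC per frame pair, 12 frame pairs per multiframe, a 16-bit sync sequence and a 32-bit header) are assumptions taken from the usual description of the ETSI ES 201 108 bitstream, not from the abstract itself, and the function names are hypothetical.

```python
# Assumed ES 201 108 framing constants (not stated in the abstract above).
FRAME_SHIFT_MS = 10        # one feature vector every 10 ms
BITS_PER_FRAME = 44        # quantizer indices for one feature vector
CRC_BITS_PER_PAIR = 4      # CRC protecting each pair of frames
PAIRS_PER_MULTIFRAME = 12  # 24 frames per multiframe
SYNC_BITS = 16             # multiframe synchronisation sequence
HEADER_BITS = 32           # multiframe header field


def multiframe_bits() -> int:
    """Total number of bits in one multiframe, including overhead."""
    pair_bits = 2 * BITS_PER_FRAME + CRC_BITS_PER_PAIR
    return SYNC_BITS + HEADER_BITS + PAIRS_PER_MULTIFRAME * pair_bits


def multiframe_duration_ms() -> int:
    """Speech duration covered by one multiframe, in milliseconds."""
    return 2 * PAIRS_PER_MULTIFRAME * FRAME_SHIFT_MS


def bit_rate_bps() -> float:
    """Channel bit rate implied by the assumed framing."""
    return multiframe_bits() * 1000 / multiframe_duration_ms()


if __name__ == "__main__":
    print(multiframe_bits(), "bits per", multiframe_duration_ms(), "ms")
    print(bit_rate_bps(), "bps")
```

Under these assumptions the feature payload alone accounts for 4400 bps; the CRC, sync and header fields make up the rest.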
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
The Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Enhancement of noisy speech for noise robust front-end and speech reconstruction at back-end of DSR system
This paper presents a speech enhancement method for a noise-robust front-end and for speech reconstruction at the back-end of Distributed Speech Recognition (DSR). The noise removal algorithm, LSAHT, is a two-stage noise filter that applies a log spectral amplitude (LSA) speech estimator and harmonic tunneling (HT) prior to feature extraction. The noise-reduced features are transmitted with some...
Evaluation of Robust Speech Recognitio Speech Recognition in a Noisy Aut
In this paper, we evaluate the performance of several robust speech recognition algorithms in a noisy automobile environment as characterized by the Finnish SpeechDat–Car ASR task [1]. By applying acoustic feature compensation, model compensation, and speech detection algorithms to this task, a 51% reduction in word error rate (WER) was obtained relative to the ETSI standard ASR front–end. In a...
Combined speech enhancement and auditory modelling for robust distributed speech recognition
The performance of Automatic Speech Recognition (ASR) systems in the presence of noise is an area that has attracted a lot of research interest. Additive noise from interfering noise sources, and convolutional noise arising from transmission channel characteristics both contribute to a degradation of performance in ASR systems. This paper addresses the problem of robustness of speech recognitio...
The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
This paper describes a database designed to evaluate the performance of speech recognition algorithms in noisy conditions. The database may either be used for the evaluation of front-end feature extraction algorithms using a defined HMM recognition back-end or complete recognition systems. The source speech for this database is the TIdigits, consisting of connected digits task spoken by America...
Publication date: 2001